The importance of segmental duration and f0 for generating more natural intonation in synthetic speech

Not recorded

Abstract

This dissertation examines the importance of diphone duration and f0 information in generating more natural intonation in unit selection speech synthesis. The results showed that the diphones’ duration and f0 information were highly correlated with one another, owing to the prosodic properties inherited from the recorded human speech. Moreover, raising the importance of duration and f0 information alone largely resulted in more natural intonation in the synthetic speech.

Similar resources

On the perception of "segmental intonation": F0 context effects on sibilant identification in German

In normal modally voiced utterances, voiceless fricatives like [s], [ʃ], [f], and [x] vary such that their aperiodic pitch impressions mirror the pitch level of the adjacent F0 contour. For instance, if the F0 contour creates a high or low pitch context, then the aperiodic pitch impression of the fricative in this context will also be high or low. This context-matching effect has been termed “se...

Intonation modelling with a lexicon of natural F0 contours

We describe a new approach for generating Norwegian intonation in text-to-speech synthesis. The method is based on a phonological representation of utterances. The overall f0 contour of an utterance is synthesised by concatenating stored f0 contours corresponding to accent units. Candidate accent units are found by searching a lexicon derived from natural speech and selecting the unit that i...

Generating natural F0 trajectory with additive trees

In HMM-based TTS, while the segmental quality of synthesized speech is quite acceptable, intonation, especially at the sentence level, tends to be somewhat bland. The maximum likelihood (ML) criterion used in HMM training and parameter trajectory generation is partially responsible for the blandness. Additionally, the F0 trajectory thus generated has a smaller dynamic range than that of natural...

Applying a Hybrid into Seamless Speech

We present a speech synthesizer to seamlessly concatenate recorded and synthetic phrases to produce natural sounding and highly expressive speech. Not only the acoustic units, but also the F0 contours are seamlessly concatenated together from recorded and synthetic phrases. When mixed with recorded phrases, the F0 contours of synthetic phrases are generated adaptively relative to the actual sur...